NSF PAR Search | NSF Public Access Repository

Locating Information Gaps and Narrative Inconsistencies Across Languages: A Case Study of LGBT People Portrayals on Wikipedia

Samir, Farhan; Park, Chan Young; Field, Anjalie; Shwartz, Vered; Tsvetkov, Yulia (December 2024, EMNLP)

Modular Pluralism: Pluralistic Alignment via Multi-LLM Collaboration

Feng, Shangbin; Sorensen, Taylor; Liu, Yuhan; Fisher, Jillian; Park, Chan Young; Choi, Yejin; Tsvetkov, Yulia (December 2024, EMNLP)

VALUESCOPE: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions

Park, Chan Young; Li, Shuyue Stella; Jung, Hayoung; Volkova, Svitlana; Mitra, Tanushree; Jurgens, David; Tsvetkov, Yulia (December 2024, EMNLP)

Gen-Z: Generative Zero-Shot Text Classification with Contextualized Label Descriptions

Kumar, Sachin; Park, Chan Young; Tsvetkov, Yulia (May 2024, International Conference on Learning Representations)

Language model (LM) prompting, a popular paradigm for solving NLP tasks, has been shown to be susceptible to miscalibration and brittle under slight prompt variations, a consequence of its discriminative approach of predicting the label given the input. To address these issues, we propose Gen-Z, a generative prompting framework for zero-shot text classification. Gen-Z is generative, as it measures the LM likelihood of the input text conditioned on natural language descriptions of labels. The framework is multivariate, as label descriptions allow us to seamlessly integrate additional contextual information about the labels to improve task performance. On various standard classification benchmarks, with six open-source LM families, we show that zero-shot classification with simple contextualization of the data source of the evaluation set consistently outperforms both zero-shot and few-shot baselines while improving robustness to prompt variations. Further, our approach enables personalizing classification in a zero-shot manner by incorporating author, subject, or reader information in the label descriptions.
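The generative decision rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_log_likelihood` is a hypothetical stand-in for a real LM (an add-one-smoothed unigram model built from the label description), and the label descriptions are invented for illustration. Only the overall shape follows the abstract: score the input text conditioned on each label's description, then pick the highest-scoring label.

```python
import math
from collections import Counter
from typing import Callable, Dict

# (input_text, label_description) -> log p(input_text | label_description)
Scorer = Callable[[str, str], float]


def toy_log_likelihood(text: str, description: str) -> float:
    """Hypothetical stand-in for an LM: an add-one-smoothed unigram model
    whose counts come from the label description."""
    desc_tokens = description.lower().split()
    text_tokens = text.lower().split()
    counts = Counter(desc_tokens)
    vocab = set(desc_tokens) | set(text_tokens)
    total = len(desc_tokens) + len(vocab)
    return sum(math.log((counts[tok] + 1) / total) for tok in text_tokens)


def classify(text: str,
             label_descriptions: Dict[str, str],
             scorer: Scorer = toy_log_likelihood) -> str:
    """Generative rule: return the label whose description makes the
    input text most likely under the scorer."""
    scores = {label: scorer(text, desc)
              for label, desc in label_descriptions.items()}
    return max(scores, key=scores.get)


# Illustrative descriptions; mentioning the data source ("movie review")
# mimics the contextualization the abstract refers to.
descriptions = {
    "positive": "This movie review is positive: wonderful great enjoyable film",
    "negative": "This movie review is negative: terrible boring awful film",
}
prediction = classify("The movie was wonderful and great", descriptions)
```

With a real LM, `scorer` would instead sum the token log-probabilities of the input text given the description as a prefix; the `classify` function would stay the same.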
P3Sum: Preserving Author’s Perspective in News Summarization with Diffusion Language Models

Liu, Yuhan; Feng, Shangbin; Han, Xiaochuang; Balachandran, Vidhisha; Park, Chan Young; Kumar, Sachin; Tsvetkov, Yulia (June 2024, NAACL)

In this work, we take a first step towards designing summarization systems that are faithful to the author’s intent, not only the semantic content of the article. Focusing on a case study of preserving political perspectives in news summarization, we find that existing approaches alter the political opinions and stances of news articles in more than 50% of summaries, misrepresenting the intent and perspectives of the news authors. We thus propose P3Sum, a diffusion model-based summarization approach controlled by political perspective classifiers. In P3Sum, the political leaning of a generated summary is iteratively evaluated at each decoding step, and any drift from the article’s original stance incurs a loss that is back-propagated to the embedding layers, steering the political stance of the summary at inference time. Extensive experiments on three news summarization datasets demonstrate that P3Sum outperforms state-of-the-art summarization systems and large language models by up to 13.7% in terms of the success rate of stance preservation, with competitive performance on standard metrics of summarization quality. Our findings present a first analysis of the preservation of pragmatic features in summarization, highlight the lacunae in existing summarization models (even state-of-the-art models often struggle to preserve the author’s intent), and develop new summarization systems that are more faithful to the author’s perspective.
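The inference-time steering idea in this abstract can be sketched in a few lines of Python. This is a toy under loudly stated assumptions, not P3Sum itself: the "classifier" is a hypothetical logistic model over a single embedding vector, the loss is a squared difference between the predicted and target stance, and the diffusion denoising step that would be interleaved with each update in a real system is omitted. All names are illustrative.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def stance(embedding: List[float], weights: List[float]) -> float:
    """Hypothetical stance classifier: logistic score in (0, 1)."""
    return sigmoid(sum(w * e for w, e in zip(weights, embedding)))


def steer(embedding: List[float],
          weights: List[float],
          target_stance: float,
          steps: int = 50,
          lr: float = 0.5) -> List[float]:
    """Nudge the embedding so the classifier's stance moves toward the
    article's stance. A real diffusion decoder would run a denoising
    step between updates; that part is left out here."""
    e = list(embedding)
    for _ in range(steps):
        s = stance(e, weights)
        # loss = (s - target)^2, so d(loss)/d(e_i) = 2 (s - target) s (1 - s) w_i
        grad_scale = 2.0 * (s - target_stance) * s * (1.0 - s)
        e = [ei - lr * grad_scale * wi for ei, wi in zip(e, weights)]
    return e


weights = [1.0, -1.0]        # assumed classifier parameters
draft = [2.0, 0.0]           # assumed summary embedding that has drifted
article_stance = 0.3         # assumed stance score of the source article
steered = steer(draft, weights, article_stance)
gap_before = abs(stance(draft, weights) - article_stance)
gap_after = abs(stance(steered, weights) - article_stance)
```

The point of the sketch is the direction of the gradient: only the embedding is updated, the classifier stays fixed, and the drift from the target stance shrinks with each step.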